Abstract. Cell reproduction involves replication of diverse molecule species, in
contrast to simple replication systems with fewer components. Here, we address why
such diversity is sustained despite the efficiency of simple replication systems, using a
cell model with catalytic reaction dynamics that grows by uptake of environmental
resources. Limited resources led to increased diversity of components within the
system, and the number of coexisting species increased with a negative power of the
resource abundances. The diversity was explained by the optimum growth speed
of the cell, determined by a tradeoff between the utility of diverse resources and the
concentration onto fewer components to increase the reaction rate.


1. Introduction 

Diverse molecule species coexist in a cell; these components are encapsulated in
cellular compartments and synthesized with the aid of catalysts in order to achieve
cell reproduction by taking up essential nutrients and resources from the environment.
Indeed, huge diversity in components is achieved. However, when considering the
theoretical minimum requirements of a cell, a simple cell consisting of fewer components
has been shown to achieve a more rapid growth speed [1]. Importantly, the collision rate
of a molecule with its catalyst increases if the cell is composed of
fewer species that catalyze the replication of one another. For example, in cell models
consisting of a catalytic reaction network [2, 3, 4, 5, 6, 7, 8, 9, 10], an autocatalytic
set is formed, i.e., a collection of molecular species that synthesize themselves
through catalysis by other molecules within the set, and the presence of fewer components
has been shown to achieve more rapid growth. Several in vitro and in silico models
have supported these findings [11, 12, 13]. In general, cells with simple reaction pathways
consisting of only a few molecular species would achieve a higher reproduction rate
than a complex system containing diverse components, and the strategy to increase
compositional diversity with multiple pathways would be evolutionarily selected out.
However, in the present cells, there is a huge diversity of molecules. Thus, a theoretical
explanation of this diversity is needed in order to improve our understanding of biological
systems.

In the experiments and theoretical models for the cells, it is implicitly assumed 
that resources for the synthesis of molecules are sufficiently abundant. As long as the 
resources are sufficiently available, simplification of the molecular components would be 
more favorable for reproduction than diversification. In contrast, if resources are limited, 
focusing on only a few components using specific resources may not be evolutionarily 
stable. Thus, because of resource limitation, cells with diverse components and multiple 
pathways that manage various resources to achieve growth would be more evolutionarily 
favorable. In fact, the presence of multiple pathways due to diversity of components
stabilizes an evolving network [14] by ensuring that there are alternative routes to achieve
reproduction. This is analogous to the selection theory of ecological populations, in
which there is a tradeoff between increasing the intrinsic growth rate and the carrying
capacity [15].

However, there is an important difference between molecule replication in a cell and
individual replication in an ecosystem. A cell, as an ensemble of molecules, reproduces
itself, competes for growth, and is a unit of selection, whereas an ecosystem itself is
not an object of selection. Here we are interested in a hierarchic growth system, in
which cell growth and molecule replication have to be balanced. Indeed, we will show
that such a balance leads to a diversity transition in molecular components with the
decrease in resource abundances, as well as a general scaling law between the diversity
and resource abundances.

In the present paper, we study the possible relationships between compositional 
diversity and resource abundances, by investigating a simplified cell model consisting of 
a catalytic reaction network in which each molecule replicates by consuming respective 
resources. We first demonstrate that there is a certain threshold for resource abundances
below which the compositional diversity, i.e., the number of coexisting chemical species,
is increased. This increase follows a negative power law of resource abundances.
Importantly, we could explain this scaling relationship by the tradeoff between the 
increase in components for the use of diverse resources and the decrease in components 
to increase the reaction rate. 

2. Model 

For a cell to reproduce itself, all catalytic components have to be synthesized with the
aid of catalysts. Thus a catalytic component $i$ is replicated with the aid of $j$, while the
component $j$ is replicated with the aid of some other component. Following this general
requirement, Eigen introduced the hypercycle model, which consists of the catalytic reaction
$X_i + X_j \to 2X_i + X_j$ [5]. Considering networks of such reactions, Kauffman introduced
the concept of the autocatalytic set, while a protocell model with such reaction networks has
been investigated [5]. In this general scheme for a cell model with replicating molecules,
however, resource molecules are not explicitly included; in other words, resources are
assumed to be fully supplied.

Here we modified the standard hypercycle model to include the resources. We
adopted a cell model in which each molecule $X_j$ ($j = 1, \ldots, K_M$) was replicated by
consuming a corresponding resource $S_j$ ($j = 1, \ldots, K_M$) (see Fig. 1). $M_{tot}$ cells were
defined; each contained $K_M$ species of replicating molecules, some of which possibly
had a null population. Molecules of each species $X_j$ were replicated with the aid of
some other catalytic molecule $X_i$, determined by a random catalytic reaction network,
by consuming a predetermined resource $S_j$, one of the supplied resource chemicals
$S_k$ ($k = 1, \ldots, K_M$), as follows:

$$X_j + X_i + S_j \to 2X_j + X_i. \qquad (1)$$

For this reaction to replicate $X_j$, one resource molecule is needed, and the replication
reaction does not occur if $S_j$ is less than one. The reaction coefficient is given by the
catalytic activity $c_i$ of the molecule species $X_i$, randomly determined as $c_i \in [0,1]$. With
each replication, an error occurs with probability $\mu$. This error corresponds to changes in
some monomers in the polymer sequence and in the catalytic properties of the molecule. Here,
for simplicity, for each replication of $X_j$, the molecule is replaced by a different molecule
$X_l$ ($l \neq j$) with equal probability $\mu/(K_M - 1)$, where $K_M$ is the number of molecule
species.

Each cell takes up resources ($S_j$) by diffusion from its respective resource reservoir.
From external reservoirs of concentrations $S_j^0$, the resources ($S_j$) are supplied into each
cell by diffusion, $-D(S_j - S_j^0)$. $D$ controls the uptake rate: the
resource supply is reduced by decreasing $D$. We carried out stochastic simulations of
the model, as detailed in the Appendix.

A random catalytic reaction network was constructed as follows. For each molecule
species, the density for the path of the catalytic reaction was given by $\rho$ (which was fixed
at 0.2), such that each species had $\rho K_M$ reactions on average. Once chosen, an identical
reaction network was adopted during each simulation for all cells. Even though the
underlying network did not change, each cell used only a part of the reaction pathways,
depending on its composition (e.g., both $X_j$ and $X_i$ must be present in the cell for
reaction (1) to occur). Autocatalytic reactions in which $X_j$ catalyzed the replication
of itself were excluded from the catalytic network, and direct mutual connections were
also excluded such that $X_i$ did not function as a catalyst for $X_j$ if $X_j$ was the catalyst
for $X_i$.
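
The construction rules above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `build_catalytic_network`, the representation of the network as a list of sets, and the scan order used to resolve mutual pairs are assumptions made for illustration. Because mutual pairs are rejected, the realized mean number of catalysts per species is somewhat below $\rho K_M$.

```python
import random

def build_catalytic_network(K_M, rho, rng):
    """Return catalysts[i]: the set of species j such that X_j catalyzes X_i.

    Hypothetical helper. Each directed pair is accepted with probability rho,
    skipping self-catalysis and any pair whose reverse link already exists.
    """
    catalysts = [set() for _ in range(K_M)]
    for i in range(K_M):
        for j in range(K_M):
            if i == j:
                continue  # no autocatalysis: X_i never catalyzes itself
            if i in catalysts[j]:
                continue  # X_i already catalyzes X_j, so forbid the reverse link
            if rng.random() < rho:
                catalysts[i].add(j)
    return catalysts

# Example with the paper's parameters (K_M = 200, rho = 0.2) and an arbitrary seed
network = build_catalytic_network(200, 0.2, random.Random(0))
```

A single network built this way would then be shared by all cells for the whole run, as described in the text.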

When the total number of molecules in each cell exceeded a given threshold $N$, the
cell divided into two cells and the molecules were randomly partitioned. One randomly
chosen cell was removed from the system in order to fix the total number of cells at
$M_{tot}$.

3. Results

3.1. Diversification of composition 

We investigated how diversity in composition changed with the uptake rate of the
resources, $D$. When the resources were supplied at a sufficiently rapid rate (e.g.,
for $D = 1$), three components typically dominated most of the composition (each
representing approximately 1/3 of the molecule population; see Fig. 2(A)(a)). Thus,
the minimum hypercycle was formed by three components [3, 2], as shown in the right
panel of Fig. 2(A)(a). The hypercycle established a recursively growing state, where
the composition was robust against stochasticity in reactions and perturbations by the
division processes. Most of the other molecular species were absent, while some species
appeared due to replication error. Some parasitic species that were catalyzed by a
member of the hypercycle but did not catalyze other members were found to increase in
number on occasion [17, 18, 19, 20]. However, cells dominated by the parasitic molecules
could not continue growth (see Fig. 1 in [16]). All dividing cells adopted this
three-component hypercycle, and there was no compositional diversity; cells used the minimum
reaction pathway to grow.

As $D$ decreased below a certain threshold $D_c \approx 0.01$, the number of molecular
species increased, and multiple reaction pathways were utilized (Fig. 2(A)(b)). Similar
to the three-component hypercycle, the molecular species in this case also formed a
mutually catalytic hypercycle, shown in Fig. 2(A)(b), such that every species in the
network could be replicated with the aid of other species in the network. All dividing
cells adopted approximately the same compositions. Moreover, cells dominated by
parasites appeared on occasion, but could not survive (see Fig. 2 in [16]).

To quantitatively investigate this increase in the number of species, we show in
Fig. 2(B) the number of major species (more than 10 copies at division) as a function
of $D$ for different underlying networks. Irrespective of the network samples, the
number started to increase at $D_c = 0.01$ to $0.02$, and increased as $\sim D^{-1/2}$ as $D$ decreased
below this threshold. In contrast, for $D > D_c$, cells were mainly composed of just three
primary molecular species.

This transition was estimated by determining the point where the consumption
rate of resources by the intracellular reactions reached the maximal inflow rate. Beyond
this critical point, three species typically formed the hypercycle, with each species
representing approximately 1/3 of the molecule population. Within this system, the
probability that a species $X_i$ encountered its catalyst $X_j$ was approximately 1/9,
and the reaction could occur at a rate $c_j/9$. On the other hand, the maximum inflow
rate of resources was $D S_i^0$. Thus, the balance point was estimated as $D^* = c_j/(9 S_i^0)$.
In our simulation setup, $c_i$ was drawn from $[0,1]$, but as cells with higher growth
exhibited improved survival, the $c_i$ for molecules present in cells was shifted to a
higher range ($\sim 0.8$; see Fig. 3 in [16]). Likewise, $S_i^0 \in [0,10]$ was shifted to higher
values ($\sim 7$ to $8$; Fig. 4 in [16]). Hence, the critical point could be approximated as
$D^* \approx 0.011$, consistent with the transition in our simulation results.
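
As a quick numerical check of this estimate, the following Python sketch evaluates the balance point for the shifted parameter values quoted above. The function name `balance_point` and the argument `n_cycle` are illustrative assumptions; the formula itself is the one stated in the text, $D^* = c/(9 S^0)$ for a three-member hypercycle.

```python
def balance_point(c, S0, n_cycle=3):
    """D* at which the maximal inflow D * S0 equals the consumption rate c / n_cycle**2.

    With n_cycle equally abundant species, a molecule meets its catalyst with
    probability of about 1 / n_cycle**2, so the consumption rate is c / n_cycle**2.
    """
    return c / (n_cycle ** 2 * S0)

# Typical selected values reported in the text: c ~ 0.8, S0 ~ 7 to 8
for S0 in (7.0, 8.0):
    print(S0, balance_point(0.8, S0))
```

Both values fall between 0.011 and 0.013, in line with the threshold $D_c \approx 0.01$ to $0.02$ observed in the simulations.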

The transition to diversity with multiple pathways was then analyzed in terms of
dynamical systems. As an illustration, we considered simpler reactions where two sets
of mutually catalytic molecules, ($X$, $Y$) and ($Z$, $Y$), were initially within the cells. The
molecular species mutually catalyzed the replication of each other to form a minimal
hypercycle as follows:

$$X + Y + S_X \to 2X + Y, \quad Y + X + S_Y \to 2Y + X, \qquad (2)$$

and

$$Z + Y + S_Z \to 2Z + Y, \quad Y + Z + S_Y \to 2Y + Z. \qquad (3)$$

We denoted the intrinsic catalytic activities of $X$, $Y$, and $Z$ as $c_X$, $c_Y$, and $c_Z$,
respectively. Each reaction to replicate $X$, $Y$, and $Z$ consumed the resources $S_X$, $S_Y$,
and $S_Z$, respectively.

The results of stochastic simulations (Fig. 3(A)) showed that transitions from
selection of either $X$ or $Z$ to coexistence of $X$ and $Z$, together with $Y$, occurred with
decreasing $D$. For example, by setting $c = c_X = c_Z$, we showed that the transition
point was approximately equal to the balance point where $D S^0 \approx c_Y x_{X(Z)} x_Y$, where
$x_i$ for $i = X, Y, Z$ was the concentration of the corresponding component. When the
resources were sufficiently abundant, either $X$ or $Z$ was selected in a steady state, given
by $x_{X(Z)} = c_Y/(c + c_Y)$ and $x_Y = c/(c + c_Y)$. Then, the consumption rate of the resource
$S_{X(Z)}$ was given as $c_Y x_{X(Z)} x_Y = c\, c_Y^2/(c + c_Y)^2$. Thus, the transition point was estimated
as $D^* \approx c/(S^0 (1 + c/c_Y)^2)$, which agreed well with the simulation (see the dotted curve
in Fig. 3(A)).
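
The estimate can be reproduced step by step in a few lines of Python. This is only a sketch of the arithmetic in the paragraph above; the function name `two_pathway_transition` is an assumption introduced here.

```python
def two_pathway_transition(c, c_Y, S0):
    """Estimated D* for the X-Y / Z-Y system with c = c_X = c_Z.

    Uses the single-pathway steady state x_X = c_Y / (c + c_Y),
    x_Y = c / (c + c_Y), and balances the inflow D * S0 against the
    consumption rate c_Y * x_X * x_Y.
    """
    x_X = c_Y / (c + c_Y)
    x_Y = c / (c + c_Y)
    consumption = c_Y * x_X * x_Y
    return consumption / S0

# For c = c_Y = S0 = 1 the two species each hold half of the population
print(two_pathway_transition(1.0, 1.0, 1.0))  # 0.25
```

For $c = c_Y = S^0 = 1$ this gives $D^* = 0.25$, and for general parameters it coincides with the closed form $c/(S^0 (1 + c/c_Y)^2)$.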

This transition was also clarified by changes in the flow and nullclines in the dynamical
system of rate equations for $x_X$, $x_Y$, $x_Z$ and the resources, as shown in Fig. 3(B) and
described in §1 of [16]. To analyze this system using rate equations, we assumed that
the volume of the system was constant and that the total number of molecules was fixed.
Then, the increase in molecular species $Y$ could be written as

$$\frac{dx_Y}{dt} = (c_X x_X x_Y + c_Z x_Y x_Z) S_Y - x_Y \phi, \qquad (4)$$

and the increases for $X$ and $Z$ could be written as

$$\frac{dx_X}{dt} = c_Y x_X x_Y S_X - x_X \phi, \quad \frac{dx_Z}{dt} = c_Y x_Z x_Y S_Z - x_Z \phi, \qquad (5)$$

where $x_i$ is the concentration of species $i$ ($i = X, Y, Z$), and $\phi$ is introduced to fix the total
number of molecules, such that $\phi = c_Y x_Y (x_X S_X + x_Z S_Z) + (c_X x_X + c_Z x_Z) x_Y S_Y$. By
substituting the steady-state value of each resource into the rate equations, the nullclines
and flow in the $x_X$-$x_Z$ plane were obtained as shown in Fig. 3(B).
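
The flow of these rate equations can be explored numerically. The Python sketch below uses forward-Euler integration with each resource replaced by its steady-state value, taken in the normalized form $D S^0/(f + D S^0)$ with $V = 1$ given in §1 of the supplementary material. The function names, the step size, the number of steps, and the initial condition are assumptions chosen for illustration, not the authors' procedure.

```python
def effective_rate(f, D, S0):
    """Replication flux f * s with the resource at its steady state s = D*S0 / (f + D*S0)."""
    return f * D * S0 / (f + D * S0) if f > 0 else 0.0

def integrate(x_X, x_Y, x_Z, D, c_X=1.0, c_Y=1.0, c_Z=1.0, S0=1.0, dt=0.1, steps=20000):
    """Forward-Euler integration of Eqs. (4) and (5) with steady-state resources."""
    for _ in range(steps):
        F_X = effective_rate(c_Y * x_X * x_Y, D, S0)
        F_Z = effective_rate(c_Y * x_Z * x_Y, D, S0)
        F_Y = effective_rate((c_X * x_X + c_Z * x_Z) * x_Y, D, S0)
        phi = F_X + F_Y + F_Z  # dilution term keeping x_X + x_Y + x_Z = 1
        x_X += dt * (F_X - x_X * phi)
        x_Y += dt * (F_Y - x_Y * phi)
        x_Z += dt * (F_Z - x_Z * phi)
    return x_X, x_Y, x_Z

# Strong resource limitation: the initially rare species Z recovers and all three coexist
print(integrate(0.5, 0.4, 0.1, D=0.01))
```

With $D = 0.01$ the trajectory settles near $(1/3, 1/3, 1/3)$, the coexistence point expected under strong resource limitation.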

For sufficiently large values of $D$, the nullclines for $X$, $Y$, and $Z$ merged into a single
line, and the solution was neutral on the line. In this case, even if both $X$ and $Z$
were initially present, the system converged on either $(x_X, x_Z) = (1/2, 0)$ or $(0, 1/2)$
through fluctuations due to stochasticity in the reactions; thus either $X$ or $Z$ remained. As $D$ was
decreased, the nullclines split and crossed at a single point. This point corresponded to
coexistence of both $X$ and $Z$, and this bifurcation occurred at $D S^0 = c_Y x_X x_Y$ ($\approx 0.25$
in Fig. 3(B)), consistent with our estimates thus far.


3.3. Scaling behavior and optimum number of molecule species 


Below the transition to coexistence of multiple paths, further resource limitation, as
shown in Fig. 2(B), increased the compositional diversity with a negative power of $D$.
To explain this scaling relationship, we noted the existence of the following tradeoff: 
increasing the diversity in species enabled cells to utilize more resource species for their 
own growth, but decreased the reaction rate resulting from the collision of molecules 
with their catalysts. This tradeoff yielded the optimum number of remaining molecule 
species. 

We estimated the optimal number of remaining species, $K_M^*$ ($< K_M$), which
achieved maximal growth under conditions of resource limitation. Here, we assumed that
$K_M$ was sufficiently large to assure that $K_M^*$ could be increased to reach the optimum
value. Considering that a fixed set of $K_M^*$ molecule species mutually catalyzed the
replication of each other, the temporal evolution of $N_i$, the number of molecules of a species $X_i$,
was written as $dN_i/dt \sim x_i x_j S_i$. Assuming that the concentrations of the $K_M^*$ molecule species were
approximately the same, the concentration $x_{i(j)}$ was proportional to $1/K_M^*$. Therefore,
the increase in the number of molecules depended on the number of remaining species
as follows:

$$\frac{dN_i}{dt} \sim \frac{1}{K_M^{*2}} S_i. \qquad (6)$$

In the steady state, the resource $S_i$ has a value $\bar{S}_i$ defined as

$$\bar{S}_i = \frac{D S_i^0}{1/K_M^{*2} + D S_i^0}.$$

If we assume that $S_i^0 = S^0$ for all $i$, the growth rate $G$ of the cell was defined as

$$G \equiv \sum_i \frac{dN_i}{dt} \sim \frac{K_M^* D S^0}{1 + D S^0 K_M^{*2}}.$$

Given $D$ and $S^0$, the optimum value $K_M^{opt}$ was obtained from $dG/dK_M^* = 0$, i.e.,
$1 - D S^0 K_M^{*2} = 0$, as $K_M^{opt} = (D S^0)^{-1/2}$. Hence, as long as $K_M$ was sufficiently large to allow the above
optimal value to be obtained, the scaling relationship $(D S^0)^{-1/2}$ was obtained, consistent
with Fig. 2(B).‡
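
The optimum can also be located by brute force, which makes the tradeoff easy to inspect. The sketch below evaluates $G = K D S^0/(1 + D S^0 K^2)$ over integer species numbers and picks the maximum; the function names and the search bound `K_max` are assumptions for illustration.

```python
def growth_rate(K, D, S0):
    """Estimated cell growth rate for K equally abundant, mutually catalytic species."""
    return K * D * S0 / (1.0 + D * S0 * K * K)

def optimal_species_number(D, S0, K_max):
    """Integer K in [1, K_max] that maximizes growth_rate."""
    return max(range(1, K_max + 1), key=lambda K: growth_rate(K, D, S0))

# Quartering D doubles the optimal number of species, i.e. K_opt ~ (D * S0) ** -0.5
for D in (0.01, 0.0025):
    print(D, optimal_species_number(D, 1.0, 200))
```

For $D S^0 = 0.01$ and $0.0025$ the search returns 10 and 20 species, matching $(D S^0)^{-1/2}$.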

For the estimate in eq. (6), we have excluded the possibility that each molecular
species may have more catalyst species as the number of remaining
species increases. Including this possibility, eq. (6) was replaced by $dN_i/dt \sim \sum_j x_i x_j S_i$, where the
summation runs over all the catalysts $X_j$ for $X_i$ in the present $K_M^*$ species. Actually,
the average number of catalysts for major molecules gradually increased as the resource
was limited further (see Fig. 6 in [16]). Further corrections to the scaling relationship
could be needed for much smaller values of resources (see §2 of [16]).§

‡ Note that the nonlinearity of the catalytic reaction, $x_i x_j$, was essential for determining the optimum.
In the linear reaction, i.e., $dN_i/dt \sim x_i S_i$, $K_M^{opt}$ may diverge such that the diversity would be
increased as much as possible, i.e., to $K_M$.

As an example that did not require the above correction for the catalyst number,
we considered a one-dimensional ring of mutually catalytic reactions,

$$X_i + X_{i+1} + S_i \to 2X_i + X_{i+1}, \quad X_{i+1} + X_i + S_{i+1} \to 2X_{i+1} + X_i,$$

($i = 1, \ldots, K_M$) with a periodic boundary, i.e., $X_{K_M+1}$ denotes $X_1$. In this case, the number
of catalysts is two ($X_{i-1}$ and $X_{i+1}$) for each $X_i$, irrespective of the number of species
present in the cell. The number of species increased clearly with $D^{-1/2}$ below the balance
point $D^* = c_i/(4 S_i^0)$ (see §3 and Fig. 5 in [16]).

4. Summary and discussions 

In summary, we showed that diversification of compositions occurred as a result of
limitation of various resources, when the maximum inflow and consumption of resources
were balanced. Using simple reactions, the transition was also clarified by changes of
nullclines. In addition, a gradual increase $\sim D^{-1/2}$ in the number of molecular species
was explained by estimating the optimum number of species to give the maximum
growth speed of cells.

Although we used a cell model consisting of hypercycle networks, our ‘diversity 
transition’ is expected to be general for a cell system in which each component in a 
cell is replicated for its growth as a result of catalytic reactions, by consuming external 
resources. Then with the decrease in resource abundances, the diversity in intracellular 
components is increased. Hence, the origin of diverse components in a cell is explained. 

Our study provided a first step to explaining how replicating entities of the catalytic
reaction network model respond to resource limitations and diversify their composition. The
importance of diversity even at the primitive stage of life has been emphasized by
Dyson [4], while our study suggests the role of multiple resources in the intermediate
stages from molecules to ecological populations. By mapping the replication of each
molecule species onto biological species in an ecosystem, the present diversification
might have some similarity with studies of species diversification [21, 22, 23]. In
spite of this similarity, there is one important difference. In our study, cells, as
ensembles of molecules, reproduce, and those with higher growth speed will be selected,
whereas an ecosystem itself is not a unit for reproduction and selection. Thus, in an ecosystem, there is
no direct pressure for a simple system with a higher reproduction rate. In our study,
both molecules and cells are units for reproduction and selection. We expect that the
diversity transition with the decrease of resources is a general feature of such hierarchic
reproduction systems.

In this study, we examined the compositional diversity of cells based on limitation of
resources. However, competition among cells may give rise to additional diversification
of cellular phenotypes. Indeed, different types of cells utilizing different sets of molecular
species have developed to allow cell growth to occur. In this case, the growth speed of
cells and variations in cell compositions, allowing the cells to utilize different resources,
will also be relevant for survival [24].

§ The estimation also assumed that the abundances of major molecule species were approximately
equal, with a narrow distribution around a common value. This assumption is reasonable in our model
because all the catalytic activities ($c_i$) and resource abundances ($S_i^0$) were of the same order; thus the
number of molecules of each species was approximately of the same order. However, if the abundance
is more broadly distributed, e.g., following a power-law distribution in a different setup [8], the scaling
exponent could be modified.

Figure 1. Schematic representation of our model. The system is composed of $M_{tot}$
cells, each of which contains molecule species $X_j$ ($j = 1, \ldots, K_M$) forming a catalytic
reaction network to replicate each $X_j$. Each cell takes up resources $S_k$ ($k = 1, \ldots, K_M$),
each consumed for replicating $X_k$, from the resource reservoir in the environment via
diffusion $-D(S_k - S_k^0)$, where $S_k^0$ is a randomly fixed constant $S_k^0 \in [0,10]$ and $D$ is
the diffusion constant.

Appendix 

Simulation methods 

Simulations were carried out as follows. We introduced discrete simulation steps, as
detailed below. For each simulation step, we repeated the following procedures. For
each cell $q$ ($q = 1, \ldots, M_{tot}$), we chose two molecules from the cell. If the pair of molecules,
$X_i$ and $X_j$, were a replicator and its catalyst ($X_j$ catalyzes the replication of $X_i$), the
reaction occurred with the given probability ($c_j$) if $S_i^q \geq 1$. Here $S_i^q$ denotes the resource to
replicate $X_i$ assigned to cell $q$. When the reaction occurred, a new molecule of $X_i$
was added into the cell and one molecule of the corresponding resource was subtracted,
so that $S_i^q \to S_i^q - 1$. Here, with a probability $\mu$, a new molecule $X_l$ ($l \neq i$), instead
of $X_i$, was added into the cell, resulting in a structural change. If the total number of
molecules in a cell exceeded the threshold $N$, the molecules were distributed into two
daughter cells, while one cell, randomly chosen, was removed from the system. We also
updated each $S_a^q$ to $S_a^q - D(S_a^q - S_a^0)$ ($a = 1, \ldots, K_M$).
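
The procedure above can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' implementation. Several details are assumptions made here and are not specified in the text: the data layout (a list of species indices per cell), testing only one ordering of the sampled pair, updating resources before checking for division, partitioning each molecule to either daughter with probability 1/2, and letting the second daughter inherit a copy of the mother's resource levels.

```python
import random

def simulation_step(cells, resources, catalysts, activity, S0, D, mu, N, rng):
    """One simulation step over all cells (hypothetical helper).

    cells[q]      list of species indices, one entry per molecule in cell q
    resources[q]  list of resource amounts S_a^q for cell q
    catalysts[i]  set of species that catalyze the replication of X_i
    activity[j]   catalytic activity c_j, used as the reaction probability
    """
    K_M = len(S0)
    for q in range(len(cells)):
        mols, S = cells[q], resources[q]
        if len(mols) >= 2:
            i, j = rng.sample(mols, 2)  # replicator X_i and candidate catalyst X_j
            if j in catalysts[i] and S[i] >= 1 and rng.random() < activity[j]:
                S[i] -= 1  # one resource molecule is consumed per replication
                if rng.random() < mu:
                    # replication error: a different species X_l (l != i) is produced
                    new = rng.choice([l for l in range(K_M) if l != i])
                else:
                    new = i
                mols.append(new)
        for a in range(K_M):
            S[a] -= D * (S[a] - S0[a])  # diffusive exchange with the reservoir
        if len(mols) > N:
            first, second = [], []
            for m in mols:  # random partition into two daughters (assumed 50/50)
                (first if rng.random() < 0.5 else second).append(m)
            cells[q] = first
            r = rng.randrange(len(cells))  # randomly chosen cell to be replaced
            cells[r] = second
            resources[r] = list(resources[q])  # assumption: daughter copies resources
```

Calling `simulation_step` repeatedly with a fixed network keeps the number of cells constant at `len(cells)`, which plays the role of $M_{tot}$.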

Figure 2. (A) The major composition of cells for (a) $D = 1$ and (b) $D = 0.005$. The
number of molecules for the species at division events is shown for 2000 successive
division events in the system. (a) For $D = 1$, the three molecule species (36, 66
and 134) dominated the composition (each approximately with 1/3 of $N = 1000$),
and almost all dividing cells had the same composition. However, some cells were
dominated by parasitic molecules (species 89 and 100) and could not survive. The
right panel shows the catalytic network formed by three species (the number indicates
the species $i$, and the arrow from species $i$ to $j$ indicates that $X_i$ was a catalyst for
replicating $X_j$). (b) For $D = 0.005$, more species were present in the cells, forming the
larger network shown in the right panel. Almost all cells had similar compositions,
while some were dominated by parasitic species (species 76 and 173). The parameters
were as follows: $K_M = 200$, $M_{tot} = 100$, $N = 1000$, and $\mu = 0.001$. (B) The number
of major species (more than 10 copies at division) as a function of $D$ for $N = 1000$
and 5000. Thin lines with points and thick curves show results of different network
samples and their averages, respectively. The estimated balance point is shown by the
arrow. The slope $D^{-1/2}$ is also shown as a visual guide.

Figure 3. (A) Dominance and coexistence in the simple $X$-$Y$ and $Z$-$Y$ case as
functions of $D$ and $c = c_X = c_Z$. $c_Y$ was fixed at 1. Points labelled "dependent" indicate
that the outcome depends on the samples. Stochastic simulations were performed with
the same setup as the original model, except that only three species ($X$, $Y$, and $Z$) were
considered and underwent the reactions (2) and (3). (B) Nullclines in the $x_X$-$x_Z$ plane for
concentrations $x_Y$ (red), $x_X$ (blue) and $x_Z$ (green). Flows are also indicated by arrows.
Here, $c_X = c_Y = c_Z = 1$, $S_X^0 = S_Y^0 = S_Z^0 = S^0 = 1$, $V = 1$ and $x_X + x_Y + x_Z = 1$.

1. A simple case - two sets of mutually catalyzing molecules -

Let us consider three molecule species $X$, $Y$, and $Z$. They replicate through the following
reactions,

$$X + Y + S_X \to 2X + Y, \quad Y + X + S_Y \to 2Y + X$$

and

$$Y + Z + S_Y \to 2Y + Z, \quad Z + Y + S_Z \to 2Z + Y.$$

We denote the intrinsic catalytic activities of $X$, $Y$, and $Z$ as $c_X$, $c_Y$ and $c_Z$, respectively.
Each reaction to synthesize $X$, $Y$, and $Z$ utilizes the resource $S_X$, $S_Y$, and $S_Z$,
respectively.

The intrinsic reaction rate for replicating the molecule species $Y$ is given by

$$F_Y = V (f_Y^X + f_Y^Z) s_Y.$$

Here, $V$ is the volume of the system, and $f_Y^X = c_X x_X x_Y$, $f_Y^Z = c_Z x_Y x_Z$,
where $x_i = N_i/V$ ($i = X, Y, Z$) and $N_i$ is the number of molecules of species $i$. The rates of
replicating $X$ and $Z$ are, respectively, given by

$$F_X = V f_X s_X, \quad F_Z = V f_Z s_Z.$$

Here, $f_X = c_Y x_X x_Y$ and $f_Z = c_Y x_Y x_Z$. $s_i = S_i/S_i^0$ ($i = X, Y, Z$) is the normalized
concentration of the resource, where $S_i$ is the concentration and $S_i^0$ is introduced to
normalize $s_i$ to one when $S_i = S_i^0$.

The rate equation for molecule species $Y$ is written as

$$\frac{dN_Y}{dt} = F_Y = V (f_Y^X + f_Y^Z) s_Y, \qquad (1)$$

and those for $X$ and $Z$ are written as

$$\frac{dN_X}{dt} = F_X = V f_X s_X, \quad \frac{dN_Z}{dt} = F_Z = V f_Z s_Z. \qquad (2)$$

The dynamics of resources $S_Y$, $S_X$, and $S_Z$ are respectively written as

$$\frac{dS_Y}{dt} = -V (f_Y^X + f_Y^Z) s_Y + D S_Y^0 (1 - s_Y),$$

$$\frac{dS_X}{dt} = -V f_X s_X + D S_X^0 (1 - s_X),$$

$$\frac{dS_Z}{dt} = -V f_Z s_Z + D S_Z^0 (1 - s_Z).$$

In the steady state, the value of $s_Y$, $\bar{s}_Y$, is written as

$$\bar{s}_Y = \frac{D S_Y^0}{V (f_Y^X + f_Y^Z) + D S_Y^0}.$$

For large $D$, $\bar{s}_Y \to 1$. As $D$ is decreased, $\bar{s}_Y$ starts to decrease and deviates from 1. For
smaller $D$, $\bar{s}_Y$ decreases linearly, as $\propto D$.

Similarly, for resources $S_X$ and $S_Z$ the steady-state values are written as

$$\bar{s}_X = \frac{D S_X^0}{V f_X + D S_X^0}, \quad \bar{s}_Z = \frac{D S_Z^0}{V f_Z + D S_Z^0}.$$

We assume here that the values of the resources are fixed to the steady-state values.
Then,

$$F_Y = \frac{V (f_Y^X + f_Y^Z) D S_Y^0}{V (f_Y^X + f_Y^Z) + D S_Y^0},$$

$$F_X = \frac{V f_X D S_X^0}{V f_X + D S_X^0}, \quad F_Z = \frac{V f_Z D S_Z^0}{V f_Z + D S_Z^0}.$$

Here, the dynamics of each molecule change with the diffusion constant $D$. When
$D$ is sufficiently large, each term approaches $F_Y \to V (f_Y^X + f_Y^Z)$, $F_X \to V f_X$, and $F_Z \to V f_Z$.
On the other hand, as $D$ decreases, each term approaches $F_Y \to D S_Y^0$, $F_X \to D S_X^0$, and
$F_Z \to D S_Z^0$.
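
The two limits can be checked numerically with a short Python sketch. The function name `steady_state_flux` and its argument names are assumptions for illustration; `Vf` stands for the product $V f$ of the volume and the intrinsic rate.

```python
def steady_state_flux(Vf, D, S0):
    """Replication flux F = Vf * D*S0 / (Vf + D*S0) with the resource at steady state."""
    return Vf * D * S0 / (Vf + D * S0)

Vf, S0 = 0.2, 1.0
# Abundant supply: the flux is limited by the catalytic reaction, F -> Vf
print(steady_state_flux(Vf, 1e6, S0))
# Scarce supply: the flux is limited by the inflow of the resource, F -> D * S0
print(steady_state_flux(Vf, 1e-6, S0))
```

The crossover between the reaction-limited and supply-limited regimes occurs where $V f$ and $D S^0$ are comparable, which is the balance condition used for the transition point in the main text.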

We assumed that the volume $V$ was constant such that

$$\frac{dN_a}{dt} = F_a - x_a (F_Y + F_X + F_Z),$$

where $a = X, Y, Z$. In a steady state, the following equations hold:

$$F_a = x_a (F_Y + F_X + F_Z).$$


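The equations are explicitly written as

$$\frac{(c_X x_X + c_Z x_Z) S_Y^0}{(c_X x_X + c_Z x_Z) N_Y + D S_Y^0} = x_Y \left[ \frac{c_Y x_X S_X^0}{c_Y x_X N_Y + D S_X^0} + \frac{c_Y x_Z S_Z^0}{c_Y x_Z N_Y + D S_Z^0} + \frac{(c_X x_X + c_Z x_Z) S_Y^0}{N_Y (c_X x_X + c_Z x_Z) + D S_Y^0} \right],$$

$$\frac{c_Y x_X S_X^0}{c_Y x_X N_Y + D S_X^0} = x_X \left[ \frac{c_Y x_X S_X^0}{c_Y x_X N_Y + D S_X^0} + \frac{c_Y x_Z S_Z^0}{c_Y x_Z N_Y + D S_Z^0} + \frac{(c_X x_X + c_Z x_Z) S_Y^0}{N_Y (c_X x_X + c_Z x_Z) + D S_Y^0} \right],$$

$$\frac{c_Y x_Z S_Z^0}{c_Y x_Z N_Y + D S_Z^0} = x_Z \left[ \frac{c_Y x_X S_X^0}{c_Y x_X N_Y + D S_X^0} + \frac{c_Y x_Z S_Z^0}{c_Y x_Z N_Y + D S_Z^0} + \frac{(c_X x_X + c_Z x_Z) S_Y^0}{N_Y (c_X x_X + c_Z x_Z) + D S_Y^0} \right].$$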

In Fig. 3(B) in the main text, we show nullclines for Eqs. (1) and (2) for parameters
$c_X = c_Y = c_Z = 1$, $S_X^0 = S_Y^0 = S_Z^0 = 1$, $V = 1$ and $x_X + x_Y + x_Z = 1$.
For $D \to 0$, the fixed point $(x_X, x_Y, x_Z) = (1/3, 1/3, 1/3)$ is stable. As $D$ increases,
the three nullclines change and finally fall onto a single line. On the single line, the
system is neutral and undergoes a random walk to reach either of the dominance points
$(x_X, x_Y, x_Z) = (1/2, 1/2, 0)$ and $(0, 1/2, 1/2)$.

Explicit forms of the solutions are generally complicated. Therefore, we consider
here simpler cases in which the diffusion constant $D$ is large or small. For small $D$
values, i.e., for $(c_X x_X + c_Z x_Z) N_Y \gg D S_Y^0$, $c_Y x_X N_Y \gg D S_X^0$ and $c_Y x_Z N_Y \gg D S_Z^0$, the
coexistence solution exists as

$$x_Y = \frac{S_Y^0}{S_X^0 + S_Y^0 + S_Z^0}, \quad x_X = \frac{S_X^0}{S_X^0 + S_Y^0 + S_Z^0}, \quad x_Z = \frac{S_Z^0}{S_X^0 + S_Y^0 + S_Z^0}.$$

On the other hand, in the limit of large $D$ values, i.e., for $(c_X x_X + c_Z x_Z) N_Y \ll D S_Y^0$,
$c_Y x_X N_Y \ll D S_X^0$ and $c_Y x_Z N_Y \ll D S_Z^0$,

$$(c_X x_X + c_Z x_Z) = x_Y \{ (c_Y + c_X) x_X + (c_Y + c_Z) x_Z \},$$

$$c_Y x_X = x_X \{ (c_Y + c_X) x_X + (c_Y + c_Z) x_Z \},$$

$$c_Y x_Z = x_Z \{ (c_Y + c_X) x_X + (c_Y + c_Z) x_Z \},$$

therefore,

$$(x_X, x_Y, x_Z) = \left( \frac{c_Y}{c_X + c_Y}, \frac{c_X}{c_X + c_Y}, 0 \right), \quad \left( 0, \frac{c_Z}{c_Z + c_Y}, \frac{c_Y}{c_Z + c_Y} \right).$$


2. Discussion of the estimate of the optimum number of molecule species 


The number of molecule species in a cell increases as the resources are limited. Therefore,
the number of catalysts also increases for each species because catalysts are randomly
assigned for each species from the possible $K_M$ species in the model. In this section, we
discuss the effects of the increase in the number of catalysts with the number of molecule
species in estimating the optimum number of species; the effect is neglected in the main
text. To estimate how an increase in the number of species in a cell changes the growth
speed, we considered the number of existing species, $K_M^*$, out of $K_M$ possible molecule
species. Here, we assumed that $K_M$ was sufficiently large to assure that cells could
increase $K_M^*$ to the optimum value. We consider a fixed number of molecules with
$K_M^*$ ($< K_M$) molecule species in which the $K_M^*$ species mutually catalyze the replication
of each other. The increase in the number of molecules of a species $X_i$, $N_i$, is written
as $dN_i/dt \sim \sum_j x_i x_j S_i$. Here, $x_{i(j)}$ is the concentration of species $X_i$ ($X_j$), and the term is
summed for all the catalysts $X_j$ for $X_i$ in the present $K_M^*$ species. If we assume that
the concentrations of the $K_M^*$ molecule species are approximately the same, the concentration $x_{i(j)}$
decreases as $\sim 1/K_M^*$ with increasing $K_M^*$ (up to $K_M$), if the number of all the molecules
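is fixed. Therefore, the increase is estimated as

$$\frac{dN_i}{dt} \sim \sum_j \frac{1}{K_M^{*2}} S_i = \frac{C(i, K_M^*)}{K_M^{*2}} S_i,$$

where the sum over $j$ is taken for all the catalysts $X_j$ of species $X_i$. Here $C(i, K_M^*)$ represents
the number of catalysts of species $X_i$ in the $K_M^*$ species.

The dynamics of a resource $S_i$, measured in units of $S_i^0$ as in §1, are then written as

$$\frac{dS_i}{dt} = -\frac{C(i, K_M^*)}{K_M^{*2}} S_i + D S_i^0 (1 - S_i).$$

If we assume that $S_i^0 = S^0$ for all $i$ and that $C(i, K_M^*)$ is approximately independent of
$i$, the growth rate $G$ of the cell is

$$G \sim \frac{K_M^* \, C(K_M^*) \, D S^0}{C(K_M^*) + D S^0 K_M^{*2}}.$$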
To determine how $C(K_M^*)$ increases, we examined the average number of catalysts
for major species as a function of the number of major species (Fig. 6). The three-member
cycle corresponds to the point (3,1) in the figure. As $D$ decreases, the number of
species increases so that the points are distributed to the right. When the number of species
is between 3 and 10, the average number of catalysts is distributed rather broadly between
one and two. This result suggests that the number of catalysts does not strongly depend
on the number of species in this range, supporting the basic $D^{-1/2}$ scaling.

As the number of species further increases, corrections to the scaling are needed. 
While the functional form of the increase is not conclusive from the data, we note 
the following corrections. If $C(K_M^*)$ increases as $C(K_M^*) \approx \log(K_M^*)$, a logarithmic 
correction, $(K_M^*)^2/\log(K_M^*) \approx 1/D$, is needed. If we assume that the function $C(K_M^*)$ 
is of the form $C(K_M^*) = (K_M^*)^{\alpha}$, a correction to the power law $K_M^* \sim D^{-1/2}$ is applied. 
For $\alpha < 1$, the optimum $K_M^*$ scales as $\{DS^0(1-\alpha)\}^{-1/(2-\alpha)}$ for smaller $D$ values. When 
the increase in the number of catalysts is negligible, i.e., $\alpha = 0$, the scaling $D^{-1/2}$ is 
reproduced. By approximating the increase with the power $\alpha = 1/2$ in Fig. 6, the 
diversity increases as $D^{-2/3}$. The slope is shown in Fig. 7. 
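
The scaling exponents quoted above can be checked numerically. The following is a minimal sketch, not the authors' calculation: it assumes a growth rate of the form $G(K) = K\,C(K)\,x / (C(K) + K^2 x)$ with $x = DS^0$ and $C(K) = K^{\alpha}$. This form is an assumption chosen to be consistent with the scalings stated in the text, and the function names are hypothetical. The code maximizes $G$ over $K$ by a golden-section search and compares the result with $\{(1-\alpha)x\}^{-1/(2-\alpha)}$.

```python
import math


def growth_rate(k, x, alpha):
    """Assumed growth rate G(K) = K*C*x / (C + K^2*x) with C = K**alpha (illustrative form)."""
    c = k ** alpha
    return k * c * x / (c + k * k * x)


def argmax_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return 0.5 * (a + b)


def optimum_k_numeric(x, alpha, k_max=1e8):
    """Maximize G over K in [1, k_max]; the search runs on log K for numerical stability."""
    log_k = argmax_golden(
        lambda u: growth_rate(math.exp(u), x, alpha), 0.0, math.log(k_max)
    )
    return math.exp(log_k)


def optimum_k_formula(x, alpha):
    """Closed-form optimum {(1 - alpha) * x}^(-1/(2 - alpha)), valid for alpha < 1."""
    return ((1.0 - alpha) * x) ** (-1.0 / (2.0 - alpha))


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        for x in (1e-2, 1e-4):
            print(alpha, x, optimum_k_numeric(x, alpha), optimum_k_formula(x, alpha))
```

Under this assumed form, $\alpha = 0$ gives the exponent $-1/2$ and $\alpha = 1/2$ gives $-2/3$, in line with the two cases discussed in the text.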

3. A case of one-dimensional chain structure of mutually catalytic 
molecules 

We consider mutually catalytic reactions, 



($i = 1, \ldots, K_M$) with a periodic boundary, i.e., $X_{K_M+1}$ denotes $X_1$. We refer to the above 
reactions between $X_i$ and $X_{i+1}$ as the $i$-th reactions. 

The rate equation for molecule species $X_i$ is written as 



where $f_i^{-1} = c_{i-1} x_{i-1} x_i$ and $f_i^{+1} = c_{i+1} x_{i+1} x_i$. Here $x_i$ denotes $N_i/N$, where 
$N = \sum_j N_j$. The dynamics of resources $S_i$ are written as 




In a steady state, 

$$S_i = \frac{DS_i^0}{(f_i^{-1} + f_i^{+1}) + DS_i^0}.$$
Here, we assume $c_i = 1$ for all $i$. For the minimum case, in which only a pair $X_i$ and $X_{i+1}$ 
exists in the system, $N_i = N_{i+1} = N/2$ (i.e., $x_i = x_{i+1} = 1/2$), and $f_i^{-1} = 0$, $f_i^{+1} = 1/4$; 
thus, $S_i = DS_i^0/(1/4 + DS_i^0)$. Then, the increase is written as 

$$\frac{dN}{dt} = \frac{dN_i}{dt} + \frac{dN_{i+1}}{dt} = \frac{DS_i^0}{1 + 4DS_i^0} + \frac{DS_{i+1}^0}{1 + 4DS_{i+1}^0}.$$

For cases in which more molecule species coexist in the system, we developed 
the following explanation. For a general case where $n$ species coexist ($2 < n < K_M$), 
$N_{i-n/2+1} = \ldots = N_i = N_{i+1} = \ldots = N_{i+n/2} = N/n$ (i.e., $x_j = 1/n$), and $f_j^{-1} = 1/n^2$, 
$f_j^{+1} = 1/n^2$; thus, $S_j = DS_j^0/(2/n^2 + DS_j^0)$, except for $j = i - n/2 + 1$ and $j = i + n/2$. 
For $j = i - n/2 + 1$ and $j = i + n/2$, $S_j = DS_j^0/(1/n^2 + DS_j^0)$. Thus, 

$$\frac{dN}{dt} = \frac{1}{n^2}\,\frac{DS_{i-n/2+1}^0}{1/n^2 + DS_{i-n/2+1}^0} + \sum_{j=i-n/2+2}^{i+n/2-1} \frac{2}{n^2}\,\frac{DS_j^0}{2/n^2 + DS_j^0} + \frac{1}{n^2}\,\frac{DS_{i+n/2}^0}{1/n^2 + DS_{i+n/2}^0}.$$


For the case $S_j^0 = S^0$ for all $j$, the increase $dN/dt$ is written as 

$$\frac{dN}{dt} = \frac{2DS^0}{1 + n^2 DS^0} + \frac{(n-2)DS^0}{1 + n^2 DS^0/2}.$$
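
As a consistency check, the species-by-species sum can be compared with the closed form for uniform $S^0$. The sketch below is illustrative only: it assumes $c_j = 1$, equal concentrations $x_j = 1/n$, and a steady-state resource $S_j = DS_j^0/(F_j + DS_j^0)$, where $F_j$ is $1/n^2$ for the two end species and $2/n^2$ for the intermediate ones. The function names are hypothetical.

```python
def chain_growth(n, d, s0):
    """Sum F_j * S_j over a contiguous group of n species with x_j = 1/n and c_j = 1.

    s0 is a sequence of the n reservoir values S_j^0, ordered along the chain.
    """
    if n < 2 or len(s0) != n:
        raise ValueError("need n >= 2 and one reservoir value per species")
    total = 0.0
    for pos in range(n):
        # End species have one catalyst in the group, intermediate species have two.
        neighbours = 1 if pos in (0, n - 1) else 2
        flux = neighbours / n ** 2
        # Assumed steady-state resource level for this species.
        s = d * s0[pos] / (flux + d * s0[pos])
        total += flux * s
    return total


def chain_growth_uniform(n, x):
    """Closed form for uniform reservoirs, with x = D * S^0."""
    return 2.0 * x / (1.0 + n * n * x) + (n - 2) * x / (1.0 + n * n * x / 2.0)


if __name__ == "__main__":
    d, s0 = 0.01, 10.0
    for n in (2, 3, 10):
        print(n, chain_growth(n, d, [s0] * n), chain_growth_uniform(n, d * s0))
```

For $n = 2$ both species are end species, so only the first term of the closed form contributes.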


Here, we note that the first term denotes the increase of the two species located at both 
ends of the successive indices, and the second term denotes the sum over the intermediate 
species (for $n = 2$, there is only the first term). For a more general case in which $n$ 
species coexist in the system but can be divided into subgroups, e.g., a first 
group $(X_i, X_{i+1}, \ldots, X_{i+n_1})$, a second group $(X_j, X_{j+1}, \ldots, X_{j+n_2})$, and so on ($n = \sum_i n_i$), 
the estimate is modified such that more species are categorized into the first 
term and fewer species into the second term. Such subgroups were 
actually observed in our stochastic simulation (Fig. 5), because, with $\mu = 0$, some species 
that were lost by stochasticity in division events never appeared again in the system. 
Our estimate suggests, however, that $n$ species in a single group (successive indices) 
give the maximum of $dN/dt$, because $DS^0/(1 + n^2 DS^0) < DS^0/(1 + n^2 DS^0/2)$. 
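
The effect of splitting the species into subgroups can be illustrated with a short sketch. It rests on assumptions not stated in the text: all $n$ species keep the same concentration $1/n$, $S_j^0 = S^0$ for all $j$, and each of the $m$ subgroups contains at least two species, so that each subgroup contributes exactly two end species. The function name is hypothetical.

```python
def growth_with_groups(n, m, x):
    """Estimated dN/dt for n species split into m separate contiguous groups.

    x = D * S^0. Each group is assumed to hold at least two species, so there are
    2*m end species (first-term type) and n - 2*m intermediate species.
    """
    if m < 1 or 2 * m > n:
        raise ValueError("need 1 <= m and at least two species per group")
    ends = 2 * m
    end_term = x / (1.0 + n * n * x)              # contribution of one end species
    middle_term = x / (1.0 + n * n * x / 2.0)     # contribution of one intermediate species
    return ends * end_term + (n - ends) * middle_term


if __name__ == "__main__":
    n, x = 12, 0.01
    for m in (1, 2, 3):
        print(m, growth_with_groups(n, m, x))
```

Because each end species contributes less than an intermediate one, the estimate decreases as the number of groups grows, and a single group gives the largest value under these assumptions.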

To estimate the optimum number of species $n^*$ to achieve the maximum growth 
speed, we consider a function $f(x; n) = 2x/(1 + n^2 x) + (n - 2)x/(1 + n^2 x/2)$. Here, the 
increase $dN/dt$ is written as $f(DS^0; n)$. Then the optimum $n^*$ satisfies the condition, 
given a fixed $x = DS^0$, 

$$\frac{df(x; n)}{dn} = -\frac{(n-2)nx^2}{(1 + n^2 x/2)^2} + \frac{x}{1 + n^2 x/2} - \frac{4nx^2}{(1 + n^2 x)^2} = 0.$$


The red curve in Fig. 5 shows a numerical solution for $n^*$ satisfying the condition. For 
small $x$, 

$$\frac{df(x; n)}{dn} \approx -(n-2)nx^2 + x - 4nx^2 = -(n+2)nx^2 + x,$$

so that $df/dn = 0$ reads $(n^* + 2)n^* = 1/x$. This leads to the dependence $n^* \sim (DS^0)^{-1/2}$ for 
$DS^0 \to 0$. 
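
The condition $df/dn = 0$ can also be solved numerically. The sketch below is illustrative rather than the procedure used for the red curve: it finds the root by bisection on $n > 2$ and estimates the log-log slope of $n^*$ against $x = DS^0$. The function names and the bracketing interval are assumptions made for this example. The helper `small_x_estimate` returns the positive root of $(n+2)n = 1/x$ for comparison; its prefactor need not coincide with that of the numerical root, and only the exponent is examined here.

```python
import math


def dfdn(n, x):
    """Derivative of f(x; n) = 2x/(1 + n^2 x) + (n - 2)x/(1 + n^2 x/2) with respect to n."""
    a = 1.0 + n * n * x / 2.0
    b = 1.0 + n * n * x
    return -(n - 2.0) * n * x * x / (a * a) + x / a - 4.0 * n * x * x / (b * b)


def optimum_n(x, hi=1e6, iters=200):
    """Root of df/dn on [2, hi] by bisection (df/dn > 0 at n = 2 for any x > 0)."""
    lo = 2.0
    if dfdn(hi, x) >= 0.0:
        raise ValueError("upper bracket too small for this x")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dfdn(mid, x) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def small_x_estimate(x):
    """Positive root of (n + 2) n = 1/x, the small-x condition quoted in the text."""
    return -1.0 + math.sqrt(1.0 + 1.0 / x)


def loglog_slope(x1, x2):
    """Slope of log n* versus log x between two values of x."""
    return math.log(optimum_n(x1) / optimum_n(x2)) / math.log(x1 / x2)


if __name__ == "__main__":
    for x in (1e-2, 1e-4, 1e-6):
        print(x, optimum_n(x), small_x_estimate(x))
    print("slope:", loglog_slope(1e-6, 1e-4))
```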



















4. Supplementary figures 

[Figure 1: two panels. Axis labels: "# of molecules at division" versus "# of species 89 at division" (left) and "# of species 100 at division" (right).] 


Figure 1. The number of molecules of each species as a function of that of species 89 (left) 
and species 100 (right) at the division events shown in Fig. 1(A)(a) of the main text. 
Cells with more molecules of the parasite species 89 or 100 contain fewer hypercycle members (36, 66, 
and 134). 
Figure 2. The number of molecules of each species as a function of that of species 76 (left) 
and species 173 (right) at the division events shown in Fig. 1(A)(b) of the main text. 
For cells with parasite species 76, species 32, 129 and 147 are relatively preserved to 
produce the parasite, while the numbers of the others decrease remarkably. For cells 
with increased numbers of species 173, all the other species decrease, except species 
71, which replicates with the aid of species 173. 





















Figure 3. The average catalytic activity $c_i$. Each point corresponds to the average 
value of $c_i$ of the major molecule species. Different colors indicate different network 
samples. Parameters are $N = 1000$ and $\mu = 0.001$. 



Figure 4. The average values of resource reservoir $S_i^0$. Each point corresponds to the 
average value of $S_i^0$ of the major molecule species. Different colors indicate different 
network samples. Parameters are $N = 1000$ and $\mu = 0.001$. 




[Axis label: "Average number of catalysts for the species"] 



Figure 5. The number of species as a function of $D$ in the linear-chain network model 
for $N = 1000$, 2000, and 5000. Here, $c_i = 1$, $S_i^0 = 10$ for all $i$ ($i = 1, \ldots, K_M$), and 
$\mu = 0$. The red curve shows a numerical estimate based on solving the rate equations (see 
§4). The slope $D^{-1/2}$ is also shown. In this case, the minimum is a two-species 
hypercycle, and the number starts to increase around the estimated balance point 
$D^* = c_i/4S^0 = 0.025$. 



[Figure 6, panels (a) and (b). Horizontal axis label: "# of molecule species more than 10 at division".] 


Figure 6. The average number of existing catalyst species for major species 
plotted as a function of the number of major species in (a) log-log and (b) semi-log 
scales. The red and green points show the results for $N = 1000$ and 5000, 
respectively, while different symbols show the results for different network samples with 
the network path probability fixed to $p = 0.2$. For each sample (symbol), data for 
$D = 1$, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, and 0.0001 are plotted. The slope 
$x^{1/2}$ is also shown in (a). 
Figure 7. The number of major species (more than 10 copies at division) as a function 
of $D$ for $N = 1000$ and 5000. 





